Slovak Dataset for Multilingual Question Answering
Abstract
SK-QuAD is the first manually annotated dataset of questions and answers in Slovak. It consists of more than 91k factual questions and answers from various fields. Each question has an answer marked in the corresponding paragraph. The dataset also contains negative examples in the form of "unanswered questions" and "plausible answers". The dataset is published free of charge for scientific use. We aim to contribute to the creation of Slovak or multilingual systems for generating answers to questions in a natural language. The paper provides an overview of existing datasets for question answering. It describes the annotation process and statistically analyzes the created content. The dataset expands the possibilities for training and evaluation of language models. Experiments show that the dataset achieves state-of-the-art results and improves question answering for other languages in zero-shot learning. We compare the effect of machine-translated data with that of manually annotated data. Additional data improve the modeling of low-resourced languages.
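As a rough illustration of what "an answer marked in the corresponding paragraph" and "plausible answers" for unanswerable questions can look like in practice, here is a minimal sketch. It assumes a SQuAD v2.0-style layout (`answer_start`, `is_impossible`, `plausible_answers`); those field names, the helper functions, and the two Slovak example questions are illustrative assumptions and are not taken from the SK-QuAD release.

```python
# Hypothetical record in an assumed SQuAD v2.0-style layout.
# The paragraph and questions are invented for illustration only.

def make_answer(context, text):
    """Build an answer span by locating `text` inside the paragraph."""
    start = context.find(text)
    if start < 0:
        raise ValueError(f"answer {text!r} does not occur in the paragraph")
    return {"text": text, "answer_start": start}

def span_is_consistent(context, answer):
    """Check that the stored character offset really points at the answer text."""
    start = answer["answer_start"]
    return context[start:start + len(answer["text"])] == answer["text"]

def gold_answers(qa):
    """Return the gold answer strings; unanswerable questions have none."""
    if qa["is_impossible"]:
        return []
    return [a["text"] for a in qa["answers"]]

context = "Bratislava je hlavné mesto Slovenska. Leží na rieke Dunaj."

record = {
    "context": context,
    "qas": [
        {   # answerable: the answer is a span of the paragraph
            "id": "example-1",
            "question": "Aké je hlavné mesto Slovenska?",
            "is_impossible": False,
            "answers": [make_answer(context, "Bratislava")],
        },
        {   # unanswerable: the paragraph says nothing about Košice,
            # but "Dunaj" looks like a plausible (wrong) answer
            "id": "example-2",
            "question": "Na ktorej rieke ležia Košice?",
            "is_impossible": True,
            "answers": [],
            "plausible_answers": [make_answer(context, "Dunaj")],
        },
    ],
}

for qa in record["qas"]:
    print(qa["id"], gold_answers(qa))
```

Storing character offsets rather than only answer strings lets an extractive model be trained to predict span boundaries, and the `is_impossible` flag lets evaluation reward a system for abstaining on questions the paragraph cannot answer.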
Similar resources
Learning to Translate for Multilingual Question Answering
In multilingual question answering, either the question needs to be translated into the document language, or vice versa. In addition to direction, there are multiple methods to perform the translation, four of which we explore in this paper: word-based, 10-best, context-based, and grammar-based. We build a feature for each combination of translation direction and method, and train a model that ...
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
In this paper, we present the mQA model, which is able to answer questions about the content of an image. The answer can be a sentence, a phrase or a single word. Our model contains four components: a Long Short-Term Memory (LSTM) to extract the question representation, a Convolutional Neural Network (CNN) to extract the visual representation, an LSTM for storing the linguistic context in an an...
SQuAD Question Answering Dataset: CS224N Assn 4
We solve the contextual question answering problem, which is an essential part of many automated question-answering datasets. Recently the SQuAD dataset [1] was released, and several deep learning approaches have been proposed to solve it. We implement a modified version of one of them, the Dynamic Coattention model, as well as a simple baseline.
Question Answering on the SQuAD Dataset
We develop a deep learning framework for question answering on the Stanford Question Answering Dataset (SQuAD), blending ideas from existing state-of-the-art models to achieve results that surpass the original logistic regression baselines. Using a dynamic coattention encoder and an LSTM decoder, we achieved an F1 score of 55.9% on the hidden SQuAD test set. In this paper, we present the methodo...
Multilingual Question/Answering: the DIOGENE System
This paper presents the DIOGENE question/answering system developed at ITC-irst. The system is based on a rather standard architecture which includes three components for question processing, search and answer extraction. Linguistic processing strongly relies on MULTIWORDNET, an extended version of the English WORDNET. The system has been designed to address two promising directions: multilingua...
Journal
Journal title: IEEE Access
Year: 2023
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2023.3262308